Lattice-based search strategies for large vocabulary speech recognition

Authors

  • F. Richardson
  • Mari Ostendorf
  • Jan Robin Rohlicek
Abstract

Acknowledgments: I would like to first acknowledge my advisor, Prof. Mari Ostendorf, for her constant support and guidance through the research and writing which constitute this thesis. It is a great understatement to say that this work would not have been possible without her help. I would also like to acknowledge my readers, Dr. Robin Rohlicek, Rich Schwartz and Prof. David Castanon, for their insightful comments and helpful discussions. I would like to thank my lab cohorts, Sanjay, Rukmini and Owen, for their help and tolerance throughout this thesis process. In particular, I would like to thank Owen for helping me understand the detailed functions of the original BU recognition system, which he built from scratch. I would also like to thank my family, who have been extremely supportive throughout the time I've been working on this thesis. My brother Jon was always a phone call away and my parents were always there when I needed them.


Similar resources

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...


Minimum Bayes Risk Estimation and Decoding in Large Vocabulary Continuous Speech Recognition

Minimum risk estimation and decoding strategies based on lattice segmentation techniques can be used to refine large vocabulary continuous speech recognition systems through the estimation of the parameters of the underlying hidden Markov models and through the identification of smaller recognition tasks which provides the opportunity to incorporate novel modeling and decoding procedures in LVCSR...


Improved search strategy for large vocabulary continuous Mandarin speech recognition

This paper presents a new search strategy for large vocabulary continuous Mandarin speech recognition that takes into account the special structure of the Chinese language. The strategy consists of a forward pass and a backward pass, between which a high-quality syllable lattice is generated to bridge the syllable-level and word-level decoding processes. In the forward pass, considering the small number of sy...


Implementation Aspects Of Large Vocabulary Recognition Based On Intraword And Interword Phonetic Units

Most large vocabulary speech recognition systems essentially consist of a training algorithm and a recognition structure which is essentially a search for the best path through a rather large decoding network. Although the performance of the recognizer is crucially tied to the details of the training procedure, it is absolutely essential that the recognition structure be efficient in terms of c...


Effect of task complexity on search strategies for the Motorola Lexicus continuous speech recognition system

As speech recognition systems are increasingly applied to real-world problems, it is often desirable to use the same recognition engine for a variety of tasks of differing complexity. For example, the recognizer in a dictation system may need to handle a highly constrained correction grammar, as well as a large vocabulary dictation trigram. This paper explores the relationship between the comple...


Approximate inference: A sampling based modeling technique to capture complex dependencies in a language model

In this paper, we present strategies to incorporate long context information directly during the first pass decoding and also for the second pass lattice re-scoring in speech recognition systems. Long-span language models that capture complex syntactic and/or semantic information are seldom used in the first pass of large vocabulary continuous speech recognition systems due to the prohibitive i...



Publication date: 1995